Crowd Labeling: a survey

Authors

  • Jafar Muhammadi
  • Hamid Reza Rabiee
  • Abbas Hosseini
Abstract

Crowd computing empowers computer systems by utilizing humans’ perception and their ability to solve non-algorithmic problems. In this approach, a group of humans is asked to collaboratively solve a problem that cannot be solved easily by individuals, or perfectly by computers. However, there are complexities in using humans to solve problems. Lack of generative models, complex cost models, lower speed in comparison to computers, limited knowledge and skills, noise, bias, and error are examples of such complexities. An optimized crowd computing system should overcome these complexities and improve the quality of solutions. This paper answers three main questions: What is crowd computing? Why should one use crowd computing? And how should one use crowd computing? We briefly answer the first two questions and focus on the third, especially on solving classification problems using the multiple checking scenario. In addition, we compare the current methods of crowd computing and provide some guidelines for future work based on the current open issues in this field.

Similar resources

Toward a Robust Crowd-labeling Framework using Expert Evaluation and Pairwise Comparison

Crowd-labeling emerged from the need to label large-scale and complex data, a tedious, expensive, and time-consuming task. One of the main challenges in the crowd-labeling task is to control for or determine in advance the proportion of low-quality/malicious labelers. If that proportion grows too high, there is often a phase transition leading to a steep, non-linear drop in labeling accuracy as...

Quality Control of Crowd Labeling through Expert Evaluation

We propose a general scheme for quality-controlled labeling of large-scale data using multiple labels from the crowd and a “few” ground truth labels from an expert of the field. Expert-labeled instances are used to assign weights to the expertise of each crowd labeler and to the difficulty of each instance. Ground truth labels for all instances are then approximated through those weights along ...

Sembler: Ensembling Crowd Sequential Labeling for Improved Quality

Many natural language processing tasks, such as named entity recognition (NER), part-of-speech (POS) tagging, and word segmentation, can be formulated as sequential data labeling problems. Building a sound labeler requires a very large number of correctly labeled training examples, which may not always be possible. On the other hand, crowdsourcing provides an inexpensive yet efficient alter...

Robust Crowd Labeling Using Little Expertise

Crowd-labeling emerged from the need to label large-scale and complex data, a tedious, expensive, and time-consuming task. But the problem of obtaining good-quality labels from a crowd and integrating them is still unresolved. To address this challenge, we propose a new framework that automatically combines and boosts bulk crowd labels supported by a limited number of “ground truth” labels from ...

Speeding up Crowds for Low-latency Data Labeling

Data labeling is a necessary but often slow process that impedes the development of interactive systems for modern data analysis. Despite rising demand for manual data labeling, there is a surprising lack of work addressing its high and unpredictable latency. In this paper, we introduce CLAMShell, a system that speeds up crowds in order to achieve consistently low-latency data labeling. We offer...

Publication date: 2013